Fix UUID comparisons to conform to IETF RFC 4122 #23847
Thanks for the release note entry! A couple of nit suggestions to follow the phrasing in the Order of changes in the Release Notes Guidelines, add the PR number, and consider adding a link to the RFC. I found this link I used in the suggestion, if you have a better link please use it instead.
Thanks as usual for your suggestions @steveburnett 😄
LGTM, and thanks for the great writeup @ZacBlanco . I also tested these changes along with the native changes at facebookincubator/velox#10791 and the results match.
@ZacBlanco is it possible to better document the format of a Presto UUID, as it is stored in a
@tdcmeehan can we merge?
This adds UUID comparison functions that were previously unsupported. Functions added are <, >, <=, >=. Equal was already supported. The ordering is done lexicographically and conforms to IETF [RFC 4122]. This ordering also matches Presto Java after #[23847]. [RFC 4122]: https://datatracker.ietf.org/doc/html/rfc4122.html [23847]: prestodb/presto#23847
@BryanCutler following up with a PR to add javadoc here: #23961
Description
Reverse UUID bytes internally to make comparison operators conform to IETF RFC 4122.
Motivation and Context
The Presto documentation states that we support UUIDs and conform to RFC 4122 [1]:
Before this change, UUIDs were read in as two longs in big-endian format, and that byte order was used for comparisons. However, the Java bytewise comparison in our `io.airlift.slice` dependency assumes the backing values are in little-endian format, so the bytes are swapped during the comparison [2]. This made comparison operators between UUIDs incorrect according to RFC 4122 [3] §3, under "Rules for Lexical Equivalence" on p. 5.
Note that RFC 4122 [3] has an erratum in the paragraph describing the lexicographic comparison, in which the original text is inconsistent. The corrected text can be found in erratum EID 1428 [4] and is reproduced above for easier reference.
Additionally, RFC 9562 has been published, which supersedes RFC 4122. It defines the sorting rules as a simple byte-wise comparison in §6.11 [5].
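As a rough illustration of the byte-wise rule, the sketch below sorts a few UUIDs by their 16 bytes in network (big-endian) order using Python's standard `uuid` module. The sample values and the helper name `bytewise_key` are made up for this example and are not part of the PR. It also checks that this ordering agrees with comparing the UUIDs as unsigned 128-bit integers.

```python
import uuid


def bytewise_key(text: str) -> bytes:
    """Hypothetical helper: the UUID's 16 bytes, most significant byte first."""
    return uuid.UUID(text).bytes


# Sample values chosen only for illustration.
values = [
    "00000000-0000-0000-1000-000000000000",
    "00000000-0000-0000-0000-000000000001",
    "ffffffff-0000-0000-0000-000000000000",
    "00000000-0000-0000-0000-000000000000",
]

# Python compares bytes objects lexicographically, byte by byte, as unsigned values.
by_bytes = sorted(values, key=bytewise_key)

# Comparing the 128-bit integer value yields the same order.
by_int = sorted(values, key=lambda s: uuid.UUID(s).int)

for v in by_bytes:
    print(v)
```

Because the bytes are compared from most significant to least significant, the byte-wise order and the unsigned-integer order coincide.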
For example, before this change, the following comparison of UUIDs would result in a `TRUE` result:
This seems to be incorrect because the UUID `00000000-0000-0000-1000-000000000000` (say, UUID "A") has a `1` byte in a more significant position than the `1` byte in the UUID `00000000-0000-0000-0000-000000000001` (say, UUID "B"). Because A has a 1 byte in a more significant position than B, this comparison should evaluate to FALSE.
In addition, when testing this same comparison in Postgres (for which we also support the UUID type), Postgres returns results which are inconsistent with Presto.
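To make the discrepancy concrete, here is a minimal Python model of the two orderings for UUIDs A and B, assuming the comparison in question is `A < B`. It is not the Presto or airlift code. `legacy_less_than` is a hypothetical stand-in that assumes each big-endian 8-byte half ends up byte-reversed before a lexicographic comparison, which is one way to picture the byte swap described above. `rfc_less_than` compares the bytes in their original order.

```python
import uuid

A = "00000000-0000-0000-1000-000000000000"
B = "00000000-0000-0000-0000-000000000001"


def rfc_less_than(a: str, b: str) -> bool:
    """Byte-wise comparison, most significant byte first."""
    return uuid.UUID(a).bytes < uuid.UUID(b).bytes


def legacy_less_than(a: str, b: str) -> bool:
    """Hypothetical model of the old behavior (an assumption, not the real code):
    each 8-byte half is byte-reversed before the lexicographic comparison."""

    def swapped(text: str) -> bytes:
        raw = uuid.UUID(text).bytes
        return raw[:8][::-1] + raw[8:][::-1]

    return swapped(a) < swapped(b)


print("legacy A < B:", legacy_less_than(A, B))
print("rfc    A < B:", rfc_less_than(A, B))
```

Under this model the legacy comparison reports `A < B` as true, because the reversal moves B's `01` byte ahead of A's `10` byte, while the byte-wise comparison reports it as false.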
Additional verification on ordering from postgres
result
Impact
Test Plan
Contributor checklist
Release Notes
Footnotes
1. https://prestodb.io/docs/0.289/language/types.html#uuid-type
2. https://github.com/airlift/slice/blob/8f0494bdaad91f0c57f03e09aad2d77f955cfe42/src/main/java/io/airlift/slice/Slice.java#L1331-L1340
3. https://datatracker.ietf.org/doc/html/rfc4122#section-3
4. https://www.rfc-editor.org/errata/eid1428
5. https://datatracker.ietf.org/doc/html/rfc9562#section-6.11